Adjusting for chance clustering comparison measures

S Romano; NX Vinh; J Bailey; K Verspoor

Journal article

Journal of Machine Learning Research | MICROTOME PUBL | Published : 2016

Abstract

Adjusted-for-chance measures are widely used to compare partitions/clusterings of the same data set. In particular, the Adjusted Rand Index (ARI), based on pair-counting, and the Adjusted Mutual Information (AMI), based on Shannon information theory, are very popular in the clustering community. Nonetheless, it is an open problem as to what are the best application scenarios for each measure, and guidelines in the literature for their usage are sparse, with the result that users often resort to using both. Generalized Information Theoretic (IT) measures based on the Tsallis entropy have been shown to link pair-counting and Shannon IT measures. In this paper, we aim to bridge the gap between adjus..
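As an illustration of the pair-counting measure named in the abstract, the following is a minimal sketch of the standard Adjusted Rand Index computed from a contingency table. It is not the generalized Tsallis-entropy framework of the paper, and the function name `adjusted_rand_index` and its handling of degenerate inputs are assumptions made for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Standard ARI between two clusterings of the same items (illustrative sketch)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        # Assumption: with fewer than two items, treat the clusterings as identical.
        return 1.0

    # Pairs placed together in both clusterings (contingency-table cells).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each clustering separately (table margins).
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Expected agreement under random labelings with the same cluster sizes.
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2

    if maximum == expected:
        # Assumption: both clusterings are trivial (one cluster, or all singletons).
        return 1.0
    return (together_both - expected) / (maximum - expected)
```

The adjustment subtracts the agreement expected by chance, so identical partitions score 1 regardless of how the clusters are named, while unrelated partitions score near 0 and can go negative.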

University of Melbourne Researchers

Karin Verspoor Author

Related Projects (1)

Smart comparison and assessment of prediction models for better health using next generation data mining

Prediction models can be used to provide early warning of events, such as adverse medical outcomes. This project will develop principles for..

Grants

Funding Acknowledgements

James Bailey's work was supported by an Australian Research Council Future Fellowship. Experiments were carried out on the Amazon cloud, supported by an AWS in Education Grant Award.